feat(agent-email): add GET /v1/email/init readiness preflight (#1795) #1813
kovtcharov wants to merge 23 commits into
DocumentQAAgent and RoutingAgent were the last two agents left in the core source tree under src/gaia/agents/. They now ship as standalone gaia-agent-docqa / gaia-agent-routing wheels under hub/agents/python/, completing the "strip src/gaia/agents/ to framework only" goal for #1102 (only base/, tools/, registry.py, builder/ — plus the chat family and email — remain in core). docqa is a building-block RAG agent: it registers via the gaia.agent entry point as a hidden agent (mirroring fileio), default model Qwen3.5-35B-A3B-GGUF. routing is infrastructure — a meta-agent loaded by class path from the OpenAI API server, not a registry agent — so it ships without a gaia.agent entry point; gaia.api.agent_registry now resolves it at gaia_agent_routing.agent.RoutingAgent and fails loudly with an install hint when the wheel is absent.
Self-review follow-up to the docqa/routing migration: the gaia-agent-code CLI imported RoutingAgent from the old in-tree path (gaia.agents.routing.agent), which the migration broke. Repoint it at gaia_agent_routing.agent and declare gaia-agent-routing as a dependency of gaia-agent-code, since the `gaia-code` query path routes through RoutingAgent for language/project-type detection. No reverse dependency (routing → code) — routing resolves CodeAgent through the registry at runtime, avoiding a cycle. Also clears the now-dead RoutingAgent allowance in the agent-conventions checker (it only applied while routing lived under src/gaia/agents/).
# Conflicts: # hub/agents/python/docqa/tests/test_docqa_agent.py
# Conflicts: # .github/workflows/test_gaia_cli.yml # setup.py
Merging main surfaced three stale references the migration missed:
- test_default_max_steps imported the now-migrated gaia.agents.docqa; repoint it at the core BuilderAgentConfig, which exercises the same field(default_factory=default_max_steps) inheritance.
- test_agent_pypi_publish asserted every published wheel declares a gaia.agent entry point, but routing is infrastructure loaded by class-path and intentionally ships without one. Exempt it explicitly.
- Routing module path + source links in the docs still pointed at src/gaia/agents/routing; repoint to the gaia_agent_routing wheel.

Also preserve the original traceback on the gaia-code ImportError re-raise (raise ... from e) now that the block is being edited.
gaia-agent-code now depends on gaia-agent-routing>=0.1.0, which isn't published to PyPI. The Test Code Agent workflow installed code straight from the hub dir, so uv tried to resolve routing from the registry and failed. Install the local routing package first so the dep resolves locally. End users are unaffected — both wheels publish together on tag.
The API streaming tests target the 'gaia-code' model, which routes through RoutingAgent. Pre-migration routing lived in core, so it resolved automatically; now it ships as the gaia-agent-routing wheel that the API Tests job didn't install — so 3 streaming tests hit the (correct) missing-wheel error instead of a real agent. Install the local routing+code hub packages, and re-run API tests when either hub package changes.
CLAUDE.md still pointed DocumentQAAgent/RoutingAgent at the old
src/gaia/agents/{docqa,routing} locations and listed docqa in the source
tree — stale after the hub migration and misleading since CLAUDE.md loads
as context on every session. Point both at their hub wheels and drop the
docqa tree entry.
errors.py FRAMEWORK_PATHS carried a dead 'gaia/agents/routing' entry; the
wheel's frames are already filtered by 'site-packages/'. Remove it and
update the test that asserted its presence.
A fresh host passed /health and /version green even with Lemonade down and no model downloaded, then 502'd on the first triage call. /v1/email/init probes the whole triage stack — Lemonade reachable + the triage model present — and returns a structured status (200 when ready, 503 when not, with an actionable hint), so integrators can verify "ready to triage," not just "process up." Read-only: probes only, no model pull or provisioning. `loadable` is reported null in v1 (forcing a load is heavy); `present` is the readiness signal. The Lemonade reachability probe reuses the existing short-timeout /health logic (#1677) via a shared base-URL resolver. Docs kept in sync: the runtime /v1/email/spec page, the hand-maintained specification.html, and the regenerated openapi.email.json. Verified the route also mounts via the frozen sidecar's build_app().
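The readiness contract described above can be sketched as a pure function from probe results to an HTTP status and body. This is a minimal illustration, not the PR's implementation: the function name `build_init_status`, the exact body layout, and the hint wording are assumptions; only the 200/503 split, the `hint`, and `loadable` being null come from the description.

```python
from typing import Any, Dict, Optional, Tuple


def build_init_status(
    reachable: bool, present: Optional[bool]
) -> Tuple[int, Dict[str, Any]]:
    """Map probe results to (HTTP status, body).

    `present` is None when the model list could not be read.
    The body layout and hint texts are illustrative assumptions.
    """
    if not reachable:
        ready, hint = False, "Lemonade is not reachable; start it and retry."
    elif present is None:
        ready, hint = False, "Lemonade answered, but its model list could not be read."
    elif not present:
        ready, hint = False, "The triage model is not downloaded."
    else:
        ready, hint = True, None

    body = {
        "ready": ready,
        "lemonade": {"reachable": reachable},
        # `loadable` stays null in v1: forcing a load is heavy.
        "model": {"present": present if reachable else None, "loadable": None},
        "hint": hint,
    }
    return (200 if ready else 503), body
```

Keeping the decision separate from the probes makes each failure branch easy to unit-test without a running Lemonade.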
Review: PR #1813

Summary
The code here is solid and well-tested, but the PR is mistitled and under-described: it's titled

Issues Found
🟡 Title and description describe #1795, not the change being reviewed
🟢 🟢 🟢

Strengths

Verdict
Request changes: no code defect blocks merge; the blocking item is purely the 🟡 title/description mismatch (and the rebase to drop the already-merged #1795 diff). Fix those and this is an approve. The two remaining 🟢 items are follow-ups, not gates.
build_app()'s app.routes mixes APIRoutes with mounted _IncludedRouter objects, which have no .path attribute — iterating it raised AttributeError. Read .path defensively and additionally prove the route is reachable through the sidecar app with a real request (503, not 404).
Newer FastAPI keeps included routes under a mounted sub-router rather than flattening them into app.routes, so a .path scan can't find /v1/email/init even though it is served. Assert reachability with a real request (503, not 404) — version-robust proof the sidecar app mounts it.
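The defensive `.path` read from these two commits can be sketched with plain stand-in objects. `FakeRoute` and `FakeIncludedRouter` are hypothetical stand-ins, not FastAPI classes; the point is only that `getattr` with a default avoids the `AttributeError`.

```python
from typing import Iterable, List


class FakeRoute:
    """Hypothetical stand-in for a route object that exposes .path."""

    def __init__(self, path: str) -> None:
        self.path = path


class FakeIncludedRouter:
    """Hypothetical stand-in for a mounted sub-router with no .path."""


def route_paths(routes: Iterable[object]) -> List[str]:
    """Collect .path values, skipping objects that do not have one."""
    paths = []
    for route in routes:
        path = getattr(route, "path", None)  # never raises AttributeError
        if isinstance(path, str):
            paths.append(path)
    return paths
```

A path missing from this list does not prove the route is unserved, which is why the commits assert reachability with a real request (503, not 404) instead of relying on the scan.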
…layground Install & setup now drives provisioning via the API instead of copy-paste: a 'Run gaia init' button POSTs /v1/email/init and streams the output into a terminal panel (line by line, tolerant of SSE or plain-text framing), with a running/ok/failed status and an auto health-recheck on success. Built to the contract the /init PR (#1813) will serve — GET = readiness, POST = provision; until it lands the button reports the endpoint as unavailable. The manual steps remain below, and the CLI hint is now 'gaia init --profile email'.
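Tolerating both SSE and plain-text framing amounts to normalizing each raw line before it is displayed. The playground itself is browser code; the Python below is only a sketch of the idea, and `normalize_line` is a hypothetical helper name.

```python
from typing import Optional


def normalize_line(raw: str) -> Optional[str]:
    """Return the displayable text of one streamed line, or None to skip it.

    Accepts SSE framing ("data: ...") as well as bare text lines.
    """
    line = raw.rstrip("\r\n")
    if line.startswith("data:"):
        payload = line[len("data:"):]
        # SSE allows one optional space after the colon.
        return payload[1:] if payload.startswith(" ") else payload
    if line == "" or line.startswith(":"):
        return None  # SSE event separator or comment/keep-alive
    if line.startswith(("event:", "id:", "retry:")):
        return None  # other SSE fields carry no terminal output
    return line
```

One trade-off in this sketch: a plain-text line that happens to start with `:` is dropped as if it were an SSE comment.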
#1795)

GET /v1/email/init tells you the triage stack isn't ready, but a frozen-binary sidecar had no way to *fix* it. POST /v1/email/init is the provisioning companion: it tells a running local Lemonade to download the configured email model and streams newline-delimited (text/plain) progress so a consumer (the #1814 playground) can render it terminal-style, line by line. A ✓-prefixed final line means success, ✗ means failure.

Scope is the frozen-binary reality: the sidecar can't run the full `gaia init` or install Lemonade itself (chicken-and-egg). If Lemonade is unreachable the verb returns a real 503 with an actionable line and pulls nothing; once a pull starts the response is a committed 200 (HTTP status can't change mid-stream), so the trailing ✓/✗ line is the authoritative outcome. The pull posts only `model_name` (no `recipe`) for the built-in email model (the #1655 trap).

GET behavior is unchanged. POST is a streaming operational verb (like GET /spec), so it's kept out of the JSON OpenAPI and documented in the HTML spec (spec_html.py + specification.html) instead.
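Because the HTTP status is committed before the pull finishes, a consumer has to read the outcome from the last line of the stream. A minimal sketch of that rule follows; `provision_outcome` is a hypothetical name, and treating a stream with no trailing marker as a failure is an assumption, not something the commit states.

```python
from typing import Iterable


def provision_outcome(status_code: int, lines: Iterable[str]) -> bool:
    """Return True only when provisioning demonstrably succeeded."""
    last = ""
    for line in lines:
        stripped = line.strip()
        if stripped:
            last = stripped  # remember the final non-empty line
    if status_code == 503:
        return False  # Lemonade unreachable: nothing was pulled
    # A 200 only means the pull started; the trailing marker decides.
    return last.startswith("✓")
```

For example, `provision_outcome(200, ["pulling", "✗ download failed"])` is `False` even though the status was 200.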
    return StreamingResponse(_unreachable(), media_type=media_type, status_code=503)

    return StreamingResponse(
        _provision_progress(probe_base, model_id),
… presence (#1795)

GET /v1/email/init said "ready" as long as Lemonade was up and the model was downloaded, even against a Lemonade too old to run the triage stack, which then fails at request time. Readiness now also checks the server VERSION: it reads Lemonade's self-reported version from /health and compares it to the agent's required minimum, so "ready" means "ready to triage," version included.

The lemonade block gains found-vs-required fields the playground renders:

    lemonade: { reachable, base_url, version, min_version, compatible }

A too-old server → ready=false (503) with an actionable upgrade hint ("Lemonade x.y.z is older than the required a.b.c — upgrade …"). An unadvertised/unparseable version is reported compatible=null and does NOT block (mirrors gaia init's don't-block-on-unparseable policy).

Single source of truth: min_lemonade_version lives in gaia-agent.yaml (the manifest `gaia init` reads) AND as gaia_agent_email.version.MIN_LEMONADE_VERSION (the RUNTIME value; the frozen sidecar bundles neither gaia.installer nor the yaml, so the check can't read them at run time). A lock-step test fails if the two drift. The version-parse helper mirrors InitCommand._parse_version locally for the same frozen-binary reason.

/health stays liveness-only; POST /v1/email/init (provisioning) is unchanged.
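The three-valued `compatible` field (true, false, or null) can be sketched as below. This does not reproduce `InitCommand._parse_version`; the regex, the helper names, and the version strings in the usage note are assumptions made for illustration.

```python
import re
from typing import Optional, Tuple


def parse_version(text: Optional[str]) -> Optional[Tuple[int, ...]]:
    """Extract the first major.minor.patch triple, or None if unparseable."""
    if not text:
        return None
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def is_compatible(found: Optional[str], required: str) -> Optional[bool]:
    """True/False when both versions parse, None when either does not."""
    found_v, required_v = parse_version(found), parse_version(required)
    if found_v is None or required_v is None:
        return None
    return found_v >= required_v  # numeric tuple comparison, not string order


def blocks_readiness(compatible: Optional[bool]) -> bool:
    """Only a definite 'too old' blocks; None (unknown) does not."""
    return compatible is False
```

Comparing integer tuples rather than strings matters: with made-up versions, `is_compatible("9.1.0", "10.0.0")` is `False`, whereas a plain string comparison would rank "9.1.0" above "10.0.0".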
…e triage model) The frozen email sidecar can't run the full installer, so `gaia init` is the host-side path that downloads and version-checks the email triage model. Adds an "email" init profile (Gemma-4-E4B-it-GGUF) and exposes `gaia init --profile email` as a CLI choice. Its min_lemonade_version is held in lock-step with the email agent's runtime MIN_LEMONADE_VERSION (the same minimum GET /v1/email/init enforces), so the installer and readiness can't disagree on what 'compatible' means — a test asserts they match.
# Conflicts: # CLAUDE.md # hub/agents/python/email/gaia_agent_email/spec_html.py
🟡
Every other blocking route in the file already uses

# Option A (simplest): iterate the blocking generator in a threadpool
import asyncio
from starlette.concurrency import iterate_in_threadpool

@router.post("/init", include_in_schema=False)
async def email_provision() -> StreamingResponse:
    ...
    return StreamingResponse(
        # Each next() on the sync generator runs in a worker thread,
        # so a slow pull does not block the event loop.
        iterate_in_threadpool(_provision_progress(probe_base, model_id)),
        media_type=media_type,
        status_code=200,
    )
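The idea behind the suggestion, pulling each item of a blocking generator from a worker thread so the event loop stays free, can be shown with the standard library alone. This is an illustration of the technique, not Starlette's implementation; `iterate_in_thread` is a hypothetical name and the sketch needs Python 3.9+ for `asyncio.to_thread`.

```python
import asyncio
from typing import AsyncIterator, Iterator

_DONE = object()  # sentinel returned by next() when the iterator is exhausted


async def iterate_in_thread(sync_iter: Iterator[str]) -> AsyncIterator[str]:
    """Yield items from a blocking iterator without blocking the event loop."""
    while True:
        # next() runs in a worker thread; the loop can serve other requests.
        item = await asyncio.to_thread(next, sync_iter, _DONE)
        if item is _DONE:
            break
        yield item


def blocking_progress() -> Iterator[str]:
    """Stand-in for a generator that does slow, synchronous work per line."""
    yield "pulling\n"
    yield "done\n"


async def collect() -> list:
    return [line async for line in iterate_in_thread(blocking_progress())]
```

Running `asyncio.run(collect())` gathers the lines in order; only one `next()` call is in flight at a time, so the generator is never advanced concurrently.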
…d#1796) (amd#1814)

A developer evaluating the email agent has no zero-setup way to see it work: they must clone the repo, build the package, and run a CLI. This adds a GAIA-styled page the **sidecar serves itself** at `http://127.0.0.1:8131/v1/email/playground`: visit it and you get a **stack-health check** (sidecar up + a plain-language Lemonade/model diagnosis such as *"Lemonade not found"* or *"model not downloaded"*), **live triage and draft** against the running sidecar, a button that **exercises the `/v1/email/init` readiness endpoint**, and copy-paste install shortcuts.

**Localhost-only is structural, not a promise.** The page is served same-origin (no CORS, no remote-controlled JS) and the route ships `Content-Security-Policy: connect-src 'self'`, so the browser *refuses* any non-local fetch; email content can't leave the machine. Inference stays on local Lemonade.

The `/init` button consumes the readiness endpoint from **amd#1795**, implemented in **PR amd#1813** (branch `claudia/task-4a1065f9`). This branch predates it, so `/v1/email/init` returns 404 here: the button **fails loudly with a clear message** ("update the sidecar — ships with amd#1795") rather than breaking, and lights up once amd#1795 merges. The endpoint is **not** duplicated here; the playground only consumes it.

Closes amd#1796.

### Also in this PR

- **Added a "Playground" section + screenshot to the email agent README** (`hub/agents/python/email/README.md`), mirroring the npm package's architecture-diagram embed.
- **Brought the sibling `/v1/email/spec` page on-brand.** It used an off-brand orange/blue/green palette; restyled to the GAIA dark+gold tokens (matching the website + playground), self-contained (system fonts, no webfont), and added a "Convenience pages" section listing `/spec` and `/playground`.
## Test plan

- [ ] `PYTHONPATH=hub/agents/python/email python -m pytest tests/unit/agents/email/test_playground.py tests/unit/agents/email/test_spec_html.py tests/test_email_openapi_conformance.py -q`: 56 pass (route 200, CSP pins egress to `'self'`, no external resources, `/playground` excluded from `/openapi.json`, spec page still self-contained).
- [ ] Start the sidecar (`python hub/agents/python/email/packaging/server.py --host 127.0.0.1 --port 8131`), open `http://127.0.0.1:8131/v1/email/playground`:
  - Stack health shows ✓ Sidecar; the Lemonade/model row diagnoses correctly (start/stop `lemonade-server serve` to see both states).
  - Triage + Draft run live (with Lemonade up); "Run readiness check · /v1/init" shows the graceful 404 message on this branch.
  - Response header includes `Content-Security-Policy: connect-src 'self'`.
- [ ] Open `http://127.0.0.1:8131/v1/email/spec`: renders in GAIA dark+gold, lists the playground under "Convenience pages".
…eck (bump 0.2.0) (amd#1822)

## Why this matters

The email package's version lived in eight files of six different types (Python, YAML, TOML, JSON, Markdown, HTML) with no tool to keep them in sync, so references drifted silently. On `main` right now, `binaries.lock.json` still pins both `agentVersion` and `baseUrl` to `…/agents/email/0.1.0` while every other file already says `0.2.0`: a static pointer to a *prior* hub deployment that no test caught.

After this change, `AGENT_VERSION` in `version.py` is the one source of truth, a stamp script syncs every other reference from it, and a `--check` mode fails the build loudly on any mismatch, so a stale version reference can never ship again. Mirrors the Agent UI's existing pattern (`installer/version/bump-ui-version.mjs`: one source → stamps dependents → `--check` gated in CI).

**Stamped file types** (all driven from `AGENT_VERSION`): the YAML manifest, `pyproject.toml`, npm `package.json`, the lock's `agentVersion` + `baseUrl`, the two README image URLs, and the `architecture.html` version badge. `API_VERSION` (the REST/contract version) is deliberately **not** touched: it's the contract version, independent of the package build version.

**Cross-branch skip-with-warning:** three npm-side targets (README image URLs, `assets/architecture.html`) don't exist on `main` yet; they live on in-flight branches (amd#1776, amd#1814). The script **skips them with a warning** rather than failing, so it works across the partial state today and will stamp them correctly once those branches merge.

This PR only touches version strings + the new script + the two workflows + the new test; it does not touch the playground HTML, the `/v1/email/init` endpoint, or the npm client (owned by amd#1814/amd#1813/amd#1776). This PR also fixes the existing `binaries.lock.json` drift (0.1.0 → 0.2.0) as the first run of the new stamper.
## Test plan

- [x] `python hub/agents/python/email/packaging/stamp_version.py --check` passes on the post-bump tree (exit 0)
- [x] Mutating any target to a wrong version makes `--check` exit non-zero (covered by `test_stamp_version.py`)
- [x] `python -m pytest hub/agents/python/email/tests/test_stamp_version.py`: 10 passed (hermetic, no network)
- [x] Version-contract tests green with the bump: `test_agent_version_matches_package_export` + `test_agent_version_matches_package_metadata` (pyproject + in-code `AGENT_VERSION` both 0.2.0)
- [x] `black` + `isort` clean on the new files
- [x] `--check` wired into `release_agent_email.yml` (before publish) and `test_email_agent_unit.yml` (early PR drift gate; npm-side paths added to its triggers)
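The `--check` behavior described in this commit (compare every stamped reference against one source version, skip targets that do not exist yet) might look roughly like the sketch below. It is not the contents of `stamp_version.py`: `check_targets`, the regex patterns, and the in-memory target table are all assumptions chosen to keep the example hermetic.

```python
import re
from typing import Dict, List, Optional, Tuple

# A target is (file text, regex with one capture group for the version),
# or None when the file does not exist on this branch yet.
Target = Optional[Tuple[str, str]]


def check_targets(
    expected: str, targets: Dict[str, Target]
) -> Tuple[List[str], List[str]]:
    """Return (mismatched target names, skipped target names)."""
    mismatches: List[str] = []
    skipped: List[str] = []
    for name, target in targets.items():
        if target is None:
            skipped.append(name)  # skip with a warning instead of failing
            continue
        text, pattern = target
        match = re.search(pattern, text)
        if match is None or match.group(1) != expected:
            mismatches.append(name)
    return mismatches, skipped


def exit_code(mismatches: List[str]) -> int:
    """Non-zero on any drift, so CI fails loudly."""
    return 1 if mismatches else 0
```

Fed a `pyproject.toml` text at `0.2.0` and a lock file still at `0.1.0`, the check reports only the lock file as drifted, which is the situation the commit describes on `main`.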
Why this matters

A fresh host passed `/health` and `/version` green even with Lemonade down and no model downloaded, then 502'd on the very first triage call. Every readiness signal said "ready" while the stack couldn't actually triage.

`GET /v1/email/init` closes that trap: it probes the whole triage stack (Lemonade reachable + the triage model downloaded) and returns a structured status (200 when ready, 503 when not, with an actionable `hint`) so an integrator (and the npm package's `startSidecar`) can verify "ready to triage," not just "process up."

Read-only by design: probes only, no model pull or provisioning (a deferred follow-up). `loadable` is `null` in v1: forcing a load is heavy, so `present` (a cheap model-list lookup) is the readiness signal. The reachability probe reuses the existing short-timeout `/health` logic (#1677) via a shared base-URL resolver, so "Lemonade down" fails fast instead of hanging on the OS SYN timeout. Failures are loud: even when Lemonade answers `/health` but its model list can't be read, the endpoint returns 503 with a specific hint rather than silently reporting "absent."

Scope: Python sidecar + its tests/docs only; the npm client wrapper and playground are handled separately.

Test plan

- `pytest tests/unit/agents/email/test_init_endpoint.py`: 16 tests covering probe call-shape at the boundary (URL suffix, short timeout, auth header), Lemonade-down → 503 + hint, model-missing → 503 + hint, model-list-unreadable → 503 + hint, ready → 200, sidecar mount via `packaging/server.py` `build_app()`.
- `pytest tests/test_email_openapi_conformance.py hub/agents/python/email/tests/test_rest_contract.py`: running-server conformance (200/503) + committed `openapi.email.json` is drift-free.
- `pytest tests/unit/agents/email/test_spec_html.py`: runtime `/v1/email/spec` page documents the new endpoint.
- `python -m gaia_agent_email.export_openapi --check`: `/v1/email/init` present, artifact up to date.
- `python util/lint.py --black --isort --flake8` on the changed files: clean.

Docs synced: runtime `/v1/email/spec` page (`spec_html.py`), the hand-maintained `specification.html` (new `#ep-init` block + 503 row), and the regenerated `openapi.email.json`.